Abstract. Five different methods which can be used to analytically calculate entropies 
that are nonconcave as functions of the energy in the thermodynamic limit are discussed 
and compared. The five methods are based on the following ideas and techniques: 
i) microcanonical contraction, ii) metastable branches of the free energy, iii) generalized 
canonical ensembles with specific illustrations involving the so-called Gaussian and 
Betrag ensembles, iv) restricted canonical ensemble, and v) inverse Laplace transform. 
A simple long-range spin model having a nonconcave entropy is used to illustrate each 
method. 



PACS numbers: 64.60.-i, 05.20.-y, 05.20.Gg 
1. Introduction 

The recent study of many-body systems interacting via long-range potentials, such as 
gravitating particles or unscreened plasmas (see [1-4] for other examples), has revealed 
an interesting property of the entropy that went unnoticed for a surprisingly long time, 
namely, that it can be nonconcave as a function of the energy in the thermodynamic 
limit. It was known before this discovery that the entropy of finite-size systems could be 
nonconcave because of boundary or surface contributions (see, e.g., [5-7] and [8-10] for 
implicit references to this idea in the context of finite-size scaling). But, in almost all 
cases, it was assumed that this nonconcavity disappears when taking the thermodynamic 
limit because the "bulk" entropy, which is supposedly always concave as a function of 
energy, dominates over the "surface" entropy. 

We now know that the situation is more complicated and, at the same time, more 
interesting. If the interaction in a homogeneous many-particle system is short-range 
(see [3] for a definition), then the thermodynamic entropy of this system is essentially 
always concave as a function of its energy, as was proved years ago by Ruelle [11] 
(see also Lanford [12]) using a separation or so-called "subadditivity" argument. However, 
if the interaction is long-range, then the subadditivity argument does not work, and 
the entropy can be either concave or nonconcave [1-3]. The latter possibility has for 
some time been known to arise in mean-field systems, but the crucial and relatively 
recent input that the study of systems such as gravitating particles has provided is 
that nonconcave entropies also arise in systems involving physical interactions that are 
genuinely "long-range". In this sense, nonconcave entropies cannot be dismissed as 
an artifact of mean-field approximations: they are "physical". In fact, it is by now 
established that the nonconcavity of the entropy is related to many interesting physical 
phenomena, including 

• the existence of energy regions where the heat capacity, defined microcanonically, is 
negative (something forbidden in the canonical ensemble) [1-3, 13-15]; 

• the appearance of first-order phase transitions as well as metastable states in the 
canonical ensemble [3, 16, 17]; 

• the nonequivalence of the microcanonical and canonical ensembles at the 
thermodynamic and equilibrium macrostate levels [18, 19]; 

• a possible ergodicity breaking in microcanonical dynamics [20]. 

We will not be directly concerned here with any of these phenomena; instead, we 
will consider a problem of a more technical nature having to do with how entropies 
that are nonconcave can be calculated in practice. Our starting point is the age-old 
thermodynamic result stating that the entropy of a thermodynamic system is the 
Legendre transform of its free energy, and vice versa. This duality property can only be 
true, obviously, if the entropy is concave, since Legendre transforms only yield concave 
functions. Hence, if one knows or suspects that the entropy of a system is nonconcave, 
then this entropy cannot be obtained from the canonical ensemble by calculating the 
Legendre transform of the free energy. How can the entropy be calculated then? 

The goal of this paper is to describe and illustrate in the simplest way possible the 
various methods that have been proposed in the last few years to answer this question. 
Five methods, which cover the latest studies on the topic of nonconcave entropies, will 
be covered: 

(i) Microcanonical contraction (Sec. [3]); 

(ii) Metastable branches of the canonical free energy (Sec. [4]); 

(iii) Generalized canonical ensembles, with special emphasis on the Gaussian ensemble 
and a new ensemble called the Betrag ensemble (Sec. [5]); 

(iv) Restricted canonical ensemble (Sec. [6]); 

(v) Inverse Laplace transform (Sec. [7]). 

Each of these methods will be illustrated with a simple spin system introduced 
in [21, 22] as a pedagogical model of equilibrium statistical mechanics having a nonconcave 
entropy. The model, as will become clear, is not meant to represent any real physical 
system, but has the advantage of being exactly solvable, which makes it useful for 
demonstrating how the methods listed above work in practice, and for illustrating along 
the way many general results about nonconcave entropies. 

In theory, all of the methods that will be discussed can be used to calculate any 
nonconcave entropy, but we will see that some may be more effective or more "tractable" 
than others in practice, depending on the system considered. The question of selecting 
the "right" method for a given system will be discussed at the end of the paper, along 
with some open problems related to generalized canonical ensembles. 

2. Canonical ensemble 

Before we start discussing methods that can be used to calculate nonconcave entropies, 
let us convince ourselves that the Legendre transform of the canonical free energy does 
not yield the microcanonical entropy when the latter is nonconcave. This will give us 
the opportunity to introduce the basic notations used in this paper. 

Let $H_N(\omega)$ be the Hamiltonian of a classical $N$-particle system, and let $\omega$ denote a 
configuration or microstate of this system, and $\Lambda_N$ its configuration space. We define the 
thermodynamic free energy or free energy density of the canonical ensemble by the limit 

$$\varphi(\beta) = \lim_{N\to\infty} -\frac{1}{N} \ln Z_N(\beta), \tag{1}$$

where 

$$Z_N(\beta) = \int_{\Lambda_N} e^{-\beta H_N(\omega)}\, d\omega \tag{2}$$

is the $N$-particle partition function. The problem that concerns us here is to determine 
whether $\varphi(\beta)$ can be used to obtain the thermodynamic entropy or entropy density of 
the microcanonical ensemble, defined by 

$$s(u) = \lim_{N\to\infty} \frac{1}{N} \ln \Omega_N(u), \tag{3}$$

where 

$$\Omega_N(u) = \int_{\{\omega \in \Lambda_N :\, h_N(\omega) = u\}} d\omega = \int_{\Lambda_N} \delta(h_N(\omega) - u)\, d\omega \tag{4}$$

is the density of states, which gives the volume (or, more pictorially, the number) of 
microstates $\omega$ that have a mean energy $h_N(\omega) = H_N(\omega)/N$ equal to $u$. 

As mentioned in the introduction, the common answer to this problem given by 
most thermodynamics textbooks (see [23] for an exception) is that s(u) can always be 
obtained as the Legendre transform of $\varphi(\beta)$, since $\varphi(\beta)$ and s(u) are Legendre transforms 
of one another. This implies, incidentally, that one can also always calculate $\varphi(\beta)$ as 
the Legendre transform of s(u). But is this really the complete answer? What if $\varphi(\beta)$ 
is nondifferentiable, as is the case when there is a first-order phase transition in the 
canonical ensemble? How does one define the Legendre transform for this case? Also, 
what if s(u) is nonconcave? 

The latter question naturally arises when studying long-range systems. If s(u) is 
nonconcave, then the Legendre transform of $\varphi(\beta)$ cannot yield s(u) simply because 
Legendre transforms only yield concave functions [24, 25]. By reading a bit about convex 
analysis, one learns in fact that the Legendre transform of $\varphi(\beta)$, defined as 

$$\varphi^*(u) = \inf_{\beta} \{\beta u - \varphi(\beta)\}, \tag{5}$$

yields a concave function corresponding in general to the concave envelope of s(u).‡ 
Thus, if s(u) is concave, then $s(u) = \varphi^*(u)$, which is to say that s(u) is the Legendre 
transform of $\varphi(\beta)$. However, if s(u) is nonconcave, then $s(u) \neq \varphi^*(u)$. In this case, only 
the concave part of s(u) will be recovered from the Legendre transform $\varphi^*(u)$ of $\varphi(\beta)$. 

These mathematical results are illustrated in the next example using a simple spin 
model which will stay with us for the rest of the paper. For proofs of these results, the 
reader should consult the classical book of Rockafellar [24] or the more readable treatise 
of van Tiel [25]. 



Example 2.1 (Block-spin model [22]). Consider the following Hamiltonian: 

$$H_N = \frac{N}{2}\, y + \sum_{i=1}^{N/2} \sigma_i, \tag{6}$$

where $y$ and $\sigma_i$, $i = 1, 2, \ldots, N/2$, are spin variables taking values in the set $\{-1, +1\}$. 
The first term in the Hamiltonian represents the energy of a block of N/2 "frozen" spins 
constrained to take the same spin value y (N is assumed to be even). The second term 
represents the energy of a second block of N/2 "free" spins which interact neither with 
each other nor with the first block of spins. 

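The entropy density s(u) of this spin system can easily be calculated from the 
definition of this quantity, i.e., from Eqs. (3) and (4). This calculation is presented in [22] 
with the result 

$$s(u) = \begin{cases} \frac{1}{2}\, s_\sigma(2u + 1), & u \in [-1, 0) \\ \frac{1}{2}\, s_\sigma(2u - 1), & u \in [0, 1], \end{cases} \tag{7}$$

where 

$$s_\sigma(x) = -\frac{1-x}{2} \ln \frac{1-x}{2} - \frac{1+x}{2} \ln \frac{1+x}{2}, \qquad x \in [-1, 1], \tag{8}$$

is the entropy of the "free" spins $\sigma_i$. As is clear from Fig. 1, s(u) is a nonconcave function 
of u as it has more than one maximum§ and its derivative is non-monotonic. 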
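The entropy formula above can be checked against a direct count of microstates at finite N. The following is a minimal sketch, assuming the Hamiltonian $H_N = (N/2)\,y + \sum_i \sigma_i$ and the binary-entropy form of the free-spin entropy; the function names (`count_states`, `s_finite`, `s_exact`) are hypothetical and not taken from [22].

```python
import math

def xlogx(x):
    # Convention 0 * ln 0 = 0
    return x * math.log(x) if x > 0.0 else 0.0

def s_sigma(x):
    # Entropy of the "free" spins at magnetization x in [-1, 1]
    return -xlogx((1.0 - x) / 2.0) - xlogx((1.0 + x) / 2.0)

def s_exact(u):
    # Thermodynamic-limit entropy of Eq. (7)
    if u < 0.0:
        return 0.5 * s_sigma(2.0 * u + 1.0)
    return 0.5 * s_sigma(2.0 * u - 1.0)

def count_states(n, energy):
    # Number of configurations (y, sigma_1, ..., sigma_{n/2}) whose total
    # energy equals the integer `energy`, for an even number n of spins.
    half = n // 2
    total = 0
    for y in (-1, 1):
        # sum_i sigma_i = 2k - half, where k is the number of +1 free spins
        twice_k = energy - y * half + half
        if twice_k % 2 == 0 and 0 <= twice_k // 2 <= half:
            total += math.comb(half, twice_k // 2)
    return total

def s_finite(n, energy):
    # Finite-N estimate (1/N) ln Omega_N; needs at least one state at `energy`
    return math.log(count_states(n, energy)) / n
```

For N = 2000, `s_finite(2000, 500)` should lie close to `s_exact(0.25)`, with a difference of order (ln N)/N, and the value at zero energy falls well below the values at u = ±1/2, which is the nonconcavity discussed in the text.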

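To verify that this nonconcave entropy cannot be obtained from $\varphi(\beta)$, we proceed 
to calculate $\varphi^*(u)$. The direct evaluation of $Z_N(\beta)$ yields for this model 

$$Z_N(\beta) = \left(e^{\beta N/2} + e^{-\beta N/2}\right) \left(e^{\beta} + e^{-\beta}\right)^{N/2}, \tag{9}$$

so that 

$$\varphi(\beta) = -\frac{|\beta|}{2} - \frac{1}{2} \ln(2\cosh\beta). \tag{10}$$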



‡ The transformation defined in Eq. (5) is actually a generalization of the Legendre transform known as the 
Legendre-Fenchel transform, which can be applied to nondifferentiable as well as nonconcave functions; 
see [24, 25] for more details. In this paper, we refer to this transform as the Legendre transform for 
simplicity. 

§ It is easy to see that a concave function can only have one maximum. 

Figure 1. Left: Microcanonical entropy s(u) of the block-spin model defined in 
Example 2.1. Center: Canonical free energy $\varphi(\beta)$ of the model. Right: Concave 
envelope of s(u) (blue) obtained from the Legendre transform of $\varphi(\beta)$. 



From this expression, plotted in Fig. 1, we then compute the Legendre transform defined 
in Eq. (5). This calculation can again be found in [22]; the result is 

$$\varphi^*(u) = \begin{cases} \frac{1}{2}\, s_\sigma(2u + 1), & u \in [-1, -\frac{1}{2}) \\ \frac{1}{2} \ln 2, & u \in [-\frac{1}{2}, \frac{1}{2}] \\ \frac{1}{2}\, s_\sigma(2u - 1), & u \in (\frac{1}{2}, 1]. \end{cases} \tag{11}$$

This function is plotted in Fig. 1. We see, as announced, that $\varphi^*(u)$ is a concave function 
corresponding to the concave envelope of s(u). The parts of s(u) that coincide with 
$\varphi^*(u)$, for $u \in [-1, -\frac{1}{2}] \cup [\frac{1}{2}, 1]$, are called the concave parts of s(u), whereas the part such 
that $s(u) < \varphi^*(u)$, seen for $u \in (-\frac{1}{2}, \frac{1}{2})$, is called the nonconcave part of s(u). 
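The statement that the Legendre transform of the free energy only returns the concave envelope of s(u) can be verified numerically. The sketch below evaluates Eq. (5) by brute force on a finite grid of inverse temperatures; the grid range and step are arbitrary choices made for this illustration, and the helper names are hypothetical.

```python
import math

def xlogx(x):
    # Convention 0 * ln 0 = 0
    return x * math.log(x) if x > 0.0 else 0.0

def s_sigma(x):
    # Entropy of the "free" spins at magnetization x in [-1, 1]
    return -xlogx((1.0 - x) / 2.0) - xlogx((1.0 + x) / 2.0)

def s(u):
    # Microcanonical entropy of the block-spin model, Eq. (7)
    if u < 0.0:
        return 0.5 * s_sigma(2.0 * u + 1.0)
    return 0.5 * s_sigma(2.0 * u - 1.0)

def phi(beta):
    # Canonical free energy, Eq. (10), with ln(2 cosh b) = b + ln(1 + e^{-2b})
    b = abs(beta)
    return -0.5 * b - 0.5 * (b + math.log1p(math.exp(-2.0 * b)))

# Assumed grid for this sketch: beta in [-20, 20] with step 0.01
BETAS = [-20.0 + 0.01 * k for k in range(4001)]

def phi_star(u):
    # Legendre-Fenchel transform of Eq. (5), minimized over the beta grid
    return min(beta * u - phi(beta) for beta in BETAS)
```

On this grid, `phi_star(u)` agrees with `s(u)` to about four decimal places for |u| ≥ 1/2 and stays at (1/2) ln 2 for |u| < 1/2, where it lies strictly above `s(u)`.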

We can get more insights into the results presented above and illustrated in Fig. 1 
by noting the following extra results of convex analysis: 

• The entropy s at the point u is equal to the Legendre transform of $\varphi$ if and only if 
one can place a line above the graph of s(u) that touches s(u) without intersecting it. 
When this is possible, we say that s admits a supporting line at u. Mathematically, 
this property is expressed as follows: $s = \varphi^*$ at u if, and only if, there exists $\beta \in \mathbb{R}$ 
such that 

$$s(v) \le s(u) + \beta (v - u) \tag{12}$$

for all v. See [19, 26] for more details on the concept of supporting lines. 

• The concave envelope or concave hull of s(u) is obtained by constructing the set of 
all the supporting lines of s(u); see Fig. 2.‖ 

• If $\varphi$ is differentiable at $\beta$, then 

$$s(u_\beta) = \varphi^*(u_\beta) = \beta u_\beta - \varphi(\beta), \tag{13}$$

where $u_\beta = \varphi'(\beta)$.¶ 

‖ Mathematically, the concave envelope of s(u) is also given by the smallest concave function that 
majorizes s(u) or by a geometrical construction known as Maxwell's construction; see Sec. 4 of [27] or 
Chap. 3 of [21]. 

¶ This result is essentially a consequence of the so-called Gärtner-Ellis Theorem of large deviation 
theory; see Sec. 5.2 of [26]. 
Figure 2. (Color online) Left: Illustration of the concept of supporting lines: the line 
in blue is supporting but not the line in red. Right: The concave envelope of s(u) is 
given by the set of all supporting lines. 

The first result provides a useful geometric understanding of the points of the 
entropy that can or cannot be obtained from the Legendre transform of the free energy. 
This is illustrated in Fig. 2. As for the third result, it shows that s(u) is correctly given 
by the Legendre transform of $\varphi(\beta)$ for all points u lying in the image of the derivative of 
$\varphi(\beta)$, i.e., all points u such that $u = \varphi'(\beta)$ for some $\beta \in \mathbb{R}$. This implies, in particular, 
that if $\varphi(\beta)$ is everywhere differentiable and the image of $\varphi'$ coincides with the domain 
of s (i.e., the set of allowed or "realizable" values for $H_N/N$), then $s = \varphi^*$ holds globally. 
We will often come back to this result in the rest of the paper when treating generalized 
canonical ensembles. 
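The supporting-line criterion can be tested directly on the block-spin entropy. A slope $\beta$ satisfying the supporting-line inequality at u must be at least as large as every chord slope to the right of u and at most as large as every chord slope to the left of u, so such a slope exists only if the largest right chord slope does not exceed the smallest left chord slope. The sketch below applies this test on an assumed grid of energies; the function name and the tolerance are hypothetical choices.

```python
import math

def xlogx(x):
    # Convention 0 * ln 0 = 0
    return x * math.log(x) if x > 0.0 else 0.0

def s(u):
    # Block-spin entropy: s(u) = (1/2) s_sigma(2u + 1) for u < 0 and
    # (1/2) s_sigma(2u - 1) for u >= 0, with s_sigma the free-spin entropy
    x = 2.0 * u + 1.0 if u < 0.0 else 2.0 * u - 1.0
    return 0.5 * (-xlogx((1.0 - x) / 2.0) - xlogx((1.0 + x) / 2.0))

# Assumed grid of mean energies v in [-1, 1] with step 0.001
V_GRID = [-1.0 + 0.001 * k for k in range(2001)]

def has_supporting_line(u, tol=1e-9):
    # q(v) = (s(v) - s(u)) / (v - u) is the slope of the chord from u to v.
    # A supporting slope beta needs beta >= q(v) for v > u and
    # beta <= q(v) for v < u.
    lower = -math.inf
    upper = math.inf
    for v in V_GRID:
        if abs(v - u) < 1e-12:
            continue
        q = (s(v) - s(u)) / (v - u)
        if v > u:
            lower = max(lower, q)
        else:
            upper = min(upper, q)
    return lower <= upper + tol
```

With this test, interior points such as u = 0.8 or u = -0.7 admit a supporting line, whereas points such as u = 0.25 or u = -0.1 do not.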



3. Microcanonical contraction 



The block-spin model that we have studied in the previous example is simple enough that 
we can obtain its nonconcave entropy s(u) directly from the definition of this quantity. 
But, of course, for more realistic and hence more complex models, one should not hope 
to be able to obtain s(u) in this way. How can s(u) be calculated then? 

One answer finds its inspiration in large deviation theory, and attempts to derive 
s(u) by maximizing another entropy subject to the energy constraint. The basis and 
hypotheses behind this method are the following [18]. Given the Hamiltonian $H_N(\omega)$, 
one must be able to find a macrostate $M_N(\omega)$ such that the following two properties are 
satisfied: 

• The mean energy $h_N(\omega) = H_N(\omega)/N$ can be rewritten as a function of $M_N(\omega)$ either 
exactly or asymptotically in the thermodynamic limit $N \to \infty$. Mathematically, 
this implies that there exists a function h(m) such that 

$$|h_N(\omega) - h(M_N(\omega))| \to 0 \tag{14}$$

uniformly for all $\omega \in \Lambda_N$ as $N \to \infty$. The function h(m) is called the energy 
representation function. 

• There exists an entropy function s(m) for $M_N(\omega)$, which is to say that the following 
limit exists: 

$$s(m) = \lim_{N\to\infty} \frac{1}{N} \ln \Omega_N(M_N = m), \tag{15}$$

where 

$$\Omega_N(M_N = m) = \int_{\{\omega \in \Lambda_N :\, M_N(\omega) = m\}} d\omega = \int_{\Lambda_N} \delta(M_N(\omega) - m)\, d\omega \tag{16}$$

counts the number of microstates such that $M_N(\omega) = m$. The function s(m) is 
called the macrostate entropy. 



When both of these conditions are satisfied, it is relatively easy to show (see [21, 27]) 
that 

$$s(u) = \sup_{m :\, h(m) = u} s(m). \tag{17}$$

This formula is what we refer to as a microcanonical contraction. The word "contraction" 
comes from the fact that this formula can be derived from a result known in large 
deviation theory as the contraction principle [26]. 

The same result can also be seen as a form of maximum entropy principle expressing 
s(u) as the constrained maximization of the macrostate entropy s(m). It can be shown 
that the constrained maximizers of s(m) such that h(m) = u correspond physically 
to the equilibrium values of $M_N$ in the microcanonical ensemble with mean energy u. 
By denoting the set of such maximizers by $\mathcal{E}^u$, we can therefore re-express Eq. (17) as 
$s(u) = s(\mathcal{E}^u)$. 

Example 3.1. The calculation of the entropy s(u) of the block-spin model via the 
contraction formula of Eq. (17) is presented in [22]. It is easy to see that a natural choice 
of macrostate for this model is m = (y, p), where y is the spin value of the "frozen" block 
of spins, and p is the proportion of +1 spins in the block of "free" spins. In terms of this 
macrostate, we obviously have 

$$h(y, p) = \frac{y}{2} + p - \frac{1}{2}. \tag{18}$$

The macrostate entropy for this choice of macrostate is, up to a 1/2 factor, the 
Boltzmann-Shannon entropy: 

$$s(p) = -\frac{1}{2}\, p \ln p - \frac{1}{2} (1 - p) \ln(1 - p). \tag{19}$$

This entropy does not depend on y because the block of "frozen" spins does not contribute 
to the entropy of the model. We refer the reader to [22] again for the calculation of s(u) 
based on this energy representation function and macrostate entropy. 
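For this model the contraction formula reduces to a maximization over two candidates, since for each value of y the energy constraint fixes p. The sketch below carries out this maximization, assuming $h(y,p) = y/2 + p - 1/2$ and the Boltzmann-Shannon form of the macrostate entropy with its 1/2 factor; the function names are hypothetical, and the detailed calculation of [22] is not reproduced here.

```python
import math

def xlogx(x):
    # Convention 0 * ln 0 = 0
    return x * math.log(x) if x > 0.0 else 0.0

def s_macro(p):
    # Macrostate entropy: half the Boltzmann-Shannon entropy of p
    return -0.5 * xlogx(p) - 0.5 * xlogx(1.0 - p)

def h(y, p):
    # Energy representation function
    return 0.5 * y + p - 0.5

def s_contraction(u):
    # Contraction formula: maximize s_macro over all (y, p) with h(y, p) = u.
    # For each y, the constraint has the single solution p = u - y/2 + 1/2.
    candidates = []
    for y in (-1, 1):
        p = u - 0.5 * y + 0.5
        if 0.0 <= p <= 1.0:
            candidates.append(s_macro(p))
    if not candidates:
        raise ValueError("u must lie in [-1, 1]")
    return max(candidates)

def s_direct(u):
    # Entropy written directly as a function of u, for comparison
    x = 2.0 * u + 1.0 if u < 0.0 else 2.0 * u - 1.0
    return 0.5 * (-xlogx((1.0 - x) / 2.0) - xlogx((1.0 + x) / 2.0))
```

The two functions agree to machine precision over the whole range of u, e.g. `s_contraction(0.25)` and `s_direct(0.25)`.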

Other examples of calculations of entropies based on the microcanonical contraction 
formula include the mean-field Blume-Emery-Griffiths model [27, 28], the mean-field Potts 
model [29, 30] (see also Example 5.4 of [26]), the mean-field Hamiltonian model [30, 31], 
the so-called mean-field $\phi^4$ model [32, 33], as well as a variant of this model having a 
nonconcave entropy s(u, m) as a function of the energy u and magnetization m [34]. 

From this list and the form of Eq. (17), one might conclude that this equation 
is only good for mean-field models, as these are presumably the only models whose 
Hamiltonian can be re-expressed as a function of some specially chosen macrostates or 
"mean-fields". However, this is not the case. In theory, at least, it is always possible to 
express the Hamiltonian of any model, including short-range models, as a function of an 
infinite-dimensional macrostate known as the empirical process (see [35] and Sec. 5.3.4 
of [26]). But given the infinite-dimensional nature of this macrostate, the calculation of 
s(u) from it is typically impractical if not impossible. For this reason, the microcanonical 
contraction formula has mostly, if not only, been used in the context of mean-field and 
long-range models, which are in any case the models for which nonconcave entropies are 
expected to arise. 

4. Metastable branches of the free energy 

The microcanonical contraction formula discussed in the previous section involves a 
constrained maximization problem which can be transformed, following the theory of 
Lagrange multipliers, into an unconstrained maximization by considering the function 

$$G_\beta(m) = s(m) - \beta h(m), \tag{20}$$

which involves a Lagrange multiplier $\beta$ associated with the constraint h(m) = u. The 
question that we ask in this section is: Can we obtain the set of constrained (global) 
maximizers of s(m) with h(m) = u, which was denoted in the previous section by $\mathcal{E}^u$, 
from the set $\mathcal{E}_\beta$ of (global) unconstrained maximizers of the new function $G_\beta(m)$? 

The answer is no, at least if s(u) is nonconcave. Using techniques similar to those 
leading to the microcanonical contraction formula, it can indeed be proved that the 
canonical free energy $\varphi(\beta)$ is given by the set $\mathcal{E}_\beta$ through the formula 

$$\varphi(\beta) = \inf_m \{\beta h(m) - s(m)\} = -\sup_m G_\beta(m) = -G_\beta(\mathcal{E}_\beta). \tag{21}$$

Therefore, if we were able to obtain $\mathcal{E}^u$ from $\mathcal{E}_\beta$, we would be in a position to obtain s(u) 
from $\varphi(\beta)$. But we know that this is not possible when s(u) is nonconcave. Hence, $\mathcal{E}_\beta$ 
cannot be the same as $\mathcal{E}^u$ in general. 

This result may be surprising but does not contradict the theory of Lagrange 
multipliers. What this theory actually says is that the global maximizers of s(m) subject 
to the constraint h(m) = u are contained in the set of critical points of $G_\beta(m)$, which 
include the global maximizers of $G_\beta(m)$, but also any local maximizers, minimizers and 
saddle-points that this function may have. The theory simply does not say what the 
constrained maximizers of s(m) correspond to at the level of $G_\beta(m)$. To obtain this 
information, one must go deeper into the structure of the microcanonical contraction 
formula, Eq. (17), and its canonical counterpart, Eq. (21), to find the following (see [17]): 

(i) If s is nonconcave at u, then the elements of $\mathcal{E}^u$ correspond either to local minima 
of $G_\beta(m)$ or saddle-points of this function depending on the local curvature of s(u). 

(ii) If s is concave at u, then the elements of $\mathcal{E}^u$ are also elements of $\mathcal{E}_\beta$ for some $\beta \in \mathbb{R}$, 
which is consistent with the fact that $s = \varphi^*$ in this case. 

We reach two conclusions from these results. The first is that the microcanonical 
and canonical ensembles are equivalent at the level of thermodynamic properties and 
equilibrium values of macrostates when s(u) is concave [18, 19, 21, 26]. The second is 
more pragmatic: It is possible to obtain s(u) from the knowledge of the critical points of 
$G_\beta(m)$, but we must consider all the critical points of this function, not just its global 
maximizers [16]. We must locate, in particular, the local maxima of $G_\beta(m)$, which 
correspond physically to metastable values of $M_N$ in the canonical ensemble, as well as 
saddle-points of $G_\beta(m)$, which correspond to unstable values of $M_N$ in the same ensemble. 
This conclusion is put to use in the next example. 

Example 4.1. For the block-spin model, the function $G_\beta(m)$ has the simple form 

$$G_\beta(y, p) = s(p) - \beta h(y, p), \qquad y \in \{-1, 1\}, \quad p \in [0, 1], \tag{22}$$

where s(p) and h(y, p) are the macrostate entropy and energy representation function, 
respectively, introduced in Example 3.1. The calculation of the critical points of this 
form of $G_\beta(y, p)$ can be found in [22] as well as in Sec. 5.1 of [21]. For the purpose of 
this section, there are two points to note about this solution: 

(i) $G_\beta(y, p)$ has no saddle-points, but has local maxima, i.e., metastable states, for all 
$\beta \in \mathbb{R}$; 

(ii) If we denote the set of metastable points of $G_\beta(y, p)$ for a given $\beta$ by $\mathcal{M}_\beta$, then 

$$s(u) = s(\mathcal{M}_\beta) \quad \text{for} \quad u = h(\mathcal{M}_\beta) \in \left(-\tfrac{1}{2}, \tfrac{1}{2}\right).$$

The last point demonstrates that the metastable states of $G_\beta(y, p)$ can be used, as claimed, 
to recover nonconcave points of s(u). In fact, for this particular model, $\mathcal{M}_\beta$ recovers 
the whole nonconcave region of s(u). The set $\mathcal{E}_\beta$ of stable or equilibrium macrostates 
recovers only the concave parts of s(u). 
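The critical points of this example can be located in closed form, which makes the two observations easy to check. Setting the p-derivative of $s(p) - \beta h(y,p)$ to zero gives $\frac{1}{2}\ln[(1-p)/p] = \beta$ for both values of y, i.e., $p_\beta = 1/(1 + e^{2\beta})$; the two values of y then give one global and one local maximum when $\beta \neq 0$. The sketch below is only an illustration under these assumptions (the function names are hypothetical, and the full analysis is in [22]).

```python
import math

def xlogx(x):
    # Convention 0 * ln 0 = 0
    return x * math.log(x) if x > 0.0 else 0.0

def s_macro(p):
    # Macrostate entropy of the free spins (with its 1/2 factor)
    return -0.5 * xlogx(p) - 0.5 * xlogx(1.0 - p)

def h(y, p):
    # Energy representation function
    return 0.5 * y + p - 0.5

def s(u):
    # Microcanonical entropy of the block-spin model as a function of u
    x = 2.0 * u + 1.0 if u < 0.0 else 2.0 * u - 1.0
    return 0.5 * (-xlogx((1.0 - x) / 2.0) - xlogx((1.0 + x) / 2.0))

def G(beta, y, p):
    # Function maximized in the canonical ensemble: s(p) - beta * h(y, p)
    return s_macro(p) - beta * h(y, p)

def critical_p(beta):
    # Solution of s'(p) = beta, identical for both values of y
    return 1.0 / (1.0 + math.exp(2.0 * beta))

def classify(beta):
    # Returns ((y, p) stable, (y, p) metastable) for beta != 0, ranking the
    # two maxima in p (one per value of y) by their value of G.
    p = critical_p(beta)
    ranked = sorted(((G(beta, y, p), y) for y in (-1, 1)), reverse=True)
    return (ranked[0][1], p), (ranked[1][1], p)
```

For example, `classify(0.3)` returns a stable state with y = -1 and a metastable state with y = +1; the energy h of the metastable state lies in the nonconcave region, and the macrostate entropy evaluated there coincides with s(u).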

The nonconcave points of s(u) can also be related, at the thermodynamic level, to 
metastable branches of the canonical free energy function $\varphi(\beta)$ rather than metastable 
states of the canonical ensemble, as was done above. This is illustrated next. 

Example 4.2. The exact partition function shown in Eq. (9) can be put in the form 

$$Z_N(\beta) = Z_N^{(1)}(\beta) + Z_N^{(2)}(\beta), \tag{23}$$

where 

$$Z_N^{(1)}(\beta) = e^{\beta N/2} \left(e^{\beta} + e^{-\beta}\right)^{N/2}, \qquad Z_N^{(2)}(\beta) = e^{-\beta N/2} \left(e^{\beta} + e^{-\beta}\right)^{N/2}. \tag{24}$$

From these two partition functions, it is natural to define two free energy functions, 
$\varphi_1(\beta)$ and $\varphi_2(\beta)$, using the definition of the free energy shown in Eq. (1): 

$$\varphi_1(\beta) = -\frac{\beta}{2} - \frac{1}{2} \ln(2\cosh\beta), \qquad \varphi_2(\beta) = \frac{\beta}{2} - \frac{1}{2} \ln(2\cosh\beta). \tag{25}$$

Figure 3. (Color online) Left: Plot of $\varphi_1(\beta)$ (blue), $\varphi_2(\beta)$ (purple), and $\varphi(\beta)$ (dashed). 
The branches of $\varphi_1(\beta)$ and $\varphi_2(\beta)$ that lie above $\varphi(\beta)$ are metastable branches of the free 
energy. Right: The complete nonconcave entropy is recovered by taking the Legendre 
transform of $\varphi_1(\beta)$ and $\varphi_2(\beta)$. 



The relation between these two free energies and $\varphi(\beta)$ is easily found by using the 
expression of Eq. (23) in the limit defining $\varphi(\beta)$ to find 

$$\varphi(\beta) = \inf\{\varphi_1(\beta), \varphi_2(\beta)\} = \begin{cases} \varphi_2(\beta), & \beta < 0 \\ \varphi_1(\beta), & \beta \ge 0. \end{cases} \tag{26}$$

This result is illustrated in Fig. 3. The branches of $\varphi_1(\beta)$ and $\varphi_2(\beta)$ that do not contribute 
to $\varphi(\beta)$ can be interpreted as metastable branches of $\varphi(\beta)$, since they continue, in the 
sense of analytical continuation, the two 'stable' branches of $\varphi(\beta)$ while remaining above 
the 'true' minimal equilibrium free energy.+ As was the case for the metastable 
states of $G_\beta(y, p)$ studied in the previous example, these metastable branches of $\varphi(\beta)$ 
completely determine s(u) by Legendre transform: 

$$s(u) = \sup\{\varphi_1^*(u), \varphi_2^*(u)\} = \begin{cases} \varphi_1^*(u), & u \in [-1, 0) \\ \varphi_2^*(u), & u \in [0, 1]. \end{cases} \tag{27}$$

This result is also illustrated in Fig. 3 and is proved directly by calculating the Legendre 
transforms of $\varphi_1(\beta)$ and $\varphi_2(\beta)$. 
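The recovery of the full entropy from the two branches can be checked numerically. The sketch below assumes the branch free energies $\varphi_1(\beta) = -\beta/2 - \frac{1}{2}\ln(2\cosh\beta)$ and $\varphi_2(\beta) = \beta/2 - \frac{1}{2}\ln(2\cosh\beta)$, evaluates their Legendre transforms by brute force on a finite grid of $\beta$ (an arbitrary choice for this illustration), and takes the larger of the two values.

```python
import math

def xlogx(x):
    # Convention 0 * ln 0 = 0
    return x * math.log(x) if x > 0.0 else 0.0

def s(u):
    # Microcanonical entropy of the block-spin model as a function of u
    x = 2.0 * u + 1.0 if u < 0.0 else 2.0 * u - 1.0
    return 0.5 * (-xlogx((1.0 - x) / 2.0) - xlogx((1.0 + x) / 2.0))

def log2cosh(b):
    # ln(2 cosh b) in an overflow-safe form
    a = abs(b)
    return a + math.log1p(math.exp(-2.0 * a))

def phi1(beta):
    # Free energy of the branch with frozen spin y = -1
    return -0.5 * beta - 0.5 * log2cosh(beta)

def phi2(beta):
    # Free energy of the branch with frozen spin y = +1
    return 0.5 * beta - 0.5 * log2cosh(beta)

def phi(beta):
    # Equilibrium free energy: lower of the two branches
    return min(phi1(beta), phi2(beta))

# Assumed grid for this sketch: beta in [-20, 20] with step 0.01
BETAS = [-20.0 + 0.01 * k for k in range(4001)]

def legendre(phi_fn, u):
    # Legendre-Fenchel transform inf_beta {beta * u - phi_fn(beta)} on the grid
    return min(beta * u - phi_fn(beta) for beta in BETAS)

def s_from_branches(u):
    # Larger of the two branch transforms
    return max(legendre(phi1, u), legendre(phi2, u))
```

Unlike the transform of the equilibrium free energy, `s_from_branches(u)` follows `s(u)` inside the nonconcave region as well, for instance at u = 0.25. On a finite grid, the transform of the "wrong" branch is merely very negative instead of minus infinity, which is enough for the maximum to select the correct branch.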

The idea of analytically continuing the free energy around a phase transition point 
to characterize metastable states was studied for some time in the context of short-range 
models [36-39]. However, it was somewhat abandoned after it was realized that continued 
free energies do not provide correct estimates for the lifetime of metastable states. With 
hindsight, one could argue that these estimates were wrong because metastable states 
of short-range systems do not persist in the thermodynamic limit; they arise because of 
surface effects or, more precisely, because of a "sub-bulk" nonconcavity of the entropy, 
which, as mentioned, disappears in the thermodynamic or "bulk" limit. For this reason, 
metastable states of short-range systems cannot be associated with metastable branches 
of the free energy because, if they were, then the entropy would have to be nonconcave. 

+ Recall that equilibrium states of the canonical ensemble correspond, according to Gibbs, to those 
states minimizing the free energy. 

For long-range systems, the situation is different, since these can have states that 
are truly metastable in the thermodynamic sense, and are associated with metastable 
branches of $\varphi(\beta)$, as illustrated by the previous examples. The same phenomenon has 
also been studied in the context of gravitating systems; see, e.g., [40] for a recent review. 
Still, one must be careful: It is known that different entropies having the same concave 
envelope lead, by Legendre transform, to the same $\varphi(\beta)$, so it is not possible to uniquely 
determine s(u) by analytically continuing $\varphi(\beta)$. The existence of metastable branches of 
$\varphi(\beta)$ must be determined, ultimately, by calculating $Z_N(\beta)$, as was done in the previous 
example. 

5. Generalized canonical ensembles 

The use of generalized ensembles to obtain nonconcave entropies was extensively discussed 
in previous publications (see [17, 41-44]), so we will be brief here. The idea of this method 
is to obtain s(u) from the Legendre transform of a modified or generalized free energy 
function having the form 

$$\varphi_g(\beta) = \lim_{N\to\infty} -\frac{1}{N} \ln Z_{N,g}(\beta), \tag{28}$$

where 

$$Z_{N,g}(\beta) = \int_{\Lambda_N} e^{-\beta H_N(\omega) - N g(H_N(\omega)/N)}\, d\omega \tag{29}$$

is the generalized partition function. In these expressions, g is a function of the mean 
energy $H_N/N$, assumed to be continuous. Different choices for this function determine 
different generalized canonical ensembles that can be used, under some conditions on g 
(see below), to obtain s(u) even when this function is nonconcave. 

In the following, we will consider two generalized ensembles corresponding to two 
choices of g, and will show that both of these ensembles recover the nonconcave entropy 
of the block-spin model, but that one is more effective than the other for this purpose. 



The general result at play behind these two ensembles was proved in [41, 42] and can 
be stated in a simple form as follows: If, for a given choice of function g, $\varphi_g(\beta)$ is 
differentiable at $\beta$, then 

$$s(u) = \beta u - \varphi_g(\beta) + g(u) \tag{30}$$

for $u = \varphi_g'(\beta)$. This modified Legendre transform, which is written in short as $s = \varphi_g^* + g$, 
generalizes the standard Legendre transform shown in Eq. (13) in an obvious way. In 
particular, if we are able to find a function g such that $\varphi_g(\beta)$ is everywhere differentiable 
and the range of $H_N/N$ coincides with the image of $\varphi_g'(\beta)$, then $s = \varphi_g^* + g$ for all u in 
the domain of s(u). We will see next that such a function can be constructed in some 
appropriate limit. 
5.1. Gaussian ensemble 

The choice $g(u) = \gamma u^2/2$ with $\gamma \in \mathbb{R}$ in Eq. (29) leads to a generalized ensemble known 
as the Gaussian ensemble. This ensemble was first introduced in the context of Monte 
Carlo simulations by Hetherington [45, 46], who also discussed its physical interpretation 
in terms of finite-size heat baths (see [47-49]), and was later re-investigated in the context 
of nonconcave entropies in [41, 42]. It has been applied so far to obtain the nonconcave 
entropies of two spin models, namely, the mean-field Potts model [44], and mean-field 
Blume-Emery-Griffiths model [50]. We apply it next to the block-spin model. 

Example 5.1. The first natural step to take in trying to calculate the generalized 
partition function (29) with $g(u) = \gamma u^2/2$ is to use the Gaussian integral 

$$e^{-\gamma u^2/2} = \sqrt{\frac{\gamma}{2\pi}} \int_{-\infty}^{\infty} e^{-\gamma t^2/2 - i\gamma u t}\, dt, \tag{31}$$

valid for $\gamma > 0$, with $\gamma$ replaced by $\gamma N$ and $u = H_N(\omega)/N$, to obtain 

$$Z_{N,\gamma}(\beta) = \sqrt{\frac{\gamma N}{2\pi}} \int_{-\infty}^{\infty} dt\; e^{-\gamma N t^2/2}\, Z_N(\beta + i\gamma t) \tag{32}$$

for the Gaussian partition function. Unfortunately, the resulting integral over t cannot 
be evaluated in general, and in particular not for the block-spin model, despite the 
simplicity of this model. This point will be discussed in more detail in the concluding 
section. A similar integral can be obtained for $\gamma < 0$, and this one can actually be 
evaluated using a saddle-point approximation, but, as will be discussed below, the case 
$\gamma < 0$ is not useful for obtaining nonconcave entropies. 

It is possible in the end to calculate the Gaussian free energy $\varphi_\gamma(\beta)$ of the block-spin 
model by generalizing the macrostate representation of $\varphi(\beta)$ found in Eq. (21) to the 
Gaussian ensemble: 

$$\varphi_\gamma(\beta) = \inf_{y, p} \left\{ \beta h(y, p) + \frac{\gamma}{2}\, h(y, p)^2 - s(p) \right\}. \tag{33}$$

The minimization problem involved in this expression is easily solved. For $\gamma > 0$ and 
$\beta < 0$, the expression between the curly brackets above is globally minimized for 
$y = 1$ and p solving 

$$\beta + \gamma p - s'(p) = 0. \tag{34}$$

For $\gamma > 0$ and $\beta > 0$, on the other hand, the same expression is globally minimized for 
$y = -1$ and p solving 

$$\beta + \gamma (p - 1) - s'(p) = 0. \tag{35}$$

Both equations for p are transcendental equations that can be solved numerically to 
obtain $\varphi_\gamma(\beta)$. The result of this numerical calculation is shown in Fig. 4. As can be seen, 
the Gaussian free energy $\varphi_\gamma(\beta)$ obtained for $\gamma > 0$ retains the nondifferentiable point of 
$\varphi(\beta) = \varphi_0(\beta)$ at $\beta = 0$, but tends to become "less nondifferentiable" when $\gamma$ increases, 
as its left- and right-derivatives approach 0 for increasing $\gamma$. This behavior of $\varphi_\gamma(\beta)$ is 
illustrated in the lower-left panel of Fig. 4, which shows the plot of $u_{\beta,\gamma} = \partial_\beta \varphi_\gamma(\beta)$ for 
increasing values of $\gamma$. From this plot, we immediately see that the Legendre transform 
of the Gaussian ensemble, which takes the form 

$$s(u_{\beta,\gamma}) = \beta u_{\beta,\gamma} + \frac{\gamma}{2}\, u_{\beta,\gamma}^2 - \varphi_\gamma(\beta), \tag{36}$$

should recover more and more nonconcave points of s(u) as $\gamma$ increases. This is confirmed 
by the right-hand side plot of Fig. 4, which shows the part of the entropy s(u) recovered 
by Eq. (36) for $\gamma = 0$ (canonical ensemble), $\gamma = 5$, $\gamma = 10$, and $\gamma = 15$. 

Figure 4. (Color online) Gaussian ensemble. Upper left: Gaussian free energy $\varphi_\gamma(\beta)$ 
of the block-spin model for different values of $\gamma$ (see right). Lower left: $\beta$-derivative 
of $\varphi_\gamma(\beta)$. Right: Quadratic Legendre transform of the Gaussian free energy, which 
recovers the nonconcave entropy s(u) (dashed line) as $\gamma \to \infty$. The entropy is recovered 
precisely where $\varphi_\gamma$ is differentiable. 
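The numerical solution mentioned in this example can be sketched as follows for the branch $\beta < 0$, $\gamma \ge 0$, where the minimizer has y = +1 and h = p. The stationarity condition $\beta + \gamma p - s'(p) = 0$, with $s'(p) = \frac{1}{2}\ln[(1-p)/p]$, has a left-hand side that increases with p, so bisection applies. The bisection bracket, the finite-difference step, and the function names are assumptions made for this illustration; the sketch is not the procedure used to produce Fig. 4.

```python
import math

def xlogx(x):
    # Convention 0 * ln 0 = 0
    return x * math.log(x) if x > 0.0 else 0.0

def s_macro(p):
    # Macrostate entropy of the free spins; equals s(u) at u = p on this branch
    return -0.5 * xlogx(p) - 0.5 * xlogx(1.0 - p)

def ds_macro(p):
    # Derivative s'(p) = (1/2) ln((1 - p) / p)
    return 0.5 * math.log((1.0 - p) / p)

def solve_p(beta, gamma):
    # Bisection on beta + gamma * p - s'(p) = 0, increasing in p for gamma >= 0.
    # The bracket is adequate for moderate |beta| (roughly below 13).
    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if beta + gamma * mid - ds_macro(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def phi_gamma(beta, gamma):
    # Gaussian free energy on the branch y = +1 (beta <= 0, gamma >= 0)
    p = solve_p(beta, gamma)
    return beta * p + 0.5 * gamma * p * p - s_macro(p)

def recovered_point(beta, gamma, eps=1e-5):
    # u from a centered finite difference of phi_gamma in beta, then the
    # quadratic Legendre transform beta*u + gamma*u^2/2 - phi_gamma(beta)
    u = (phi_gamma(beta + eps, gamma) - phi_gamma(beta - eps, gamma)) / (2.0 * eps)
    s_val = beta * u + 0.5 * gamma * u * u - phi_gamma(beta, gamma)
    return u, s_val

def edge(gamma):
    # Smallest positive u recovered on this branch, reached as beta -> 0^-
    return solve_p(0.0, gamma)
```

For instance, `recovered_point(-0.2, 10.0)` returns an energy u inside the nonconcave region together with an entropy value that matches `s_macro(u)`, and `edge(gamma)` decreases from 1/2 at $\gamma = 0$ towards 0 as $\gamma$ grows, in line with the behavior described for the lower-left panel of Fig. 4.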

The Gaussian ensemble is an interesting statistical ensemble not only because it can 
be used to recover nonconcave entropies, as illustrated above, but also because it allows 
for a natural "parabolic" generalization of the concept of supporting lines discussed in 
Sec. 2. Because of the quadratic nature of the function g defining this ensemble, it can 
indeed be proved (see [41, 42]) that s(u) is given by the "quadratic" Legendre transform 
of $\varphi_\gamma(\beta)$ shown in Eq. (36) if 

$$s(v) \le s(u) + \beta (v - u) + \frac{\gamma}{2} (v - u)^2 \tag{37}$$

for all v. We say in this case that s admits a supporting parabola with curvature $\gamma$ at 
the point u; see Fig. 5. Therefore, the points of s(u) that are recovered by the Gaussian 
ensemble with parameter $\gamma$ are all (and only) those points that admit a supporting 
parabola with curvature $\gamma$ or, equivalently, all the points of s(u) coinciding with the 
parabolic hull of this function; see Fig. 5. 
Figure 5. (Color online) Left: Illustration of the concept of supporting parabola. 
Center: Supporting parabolas lying at the center of s(u). The $\gamma$ values of these 
parabolas are those reported in Fig. 4. The entropy at u = 0 is recovered only in the 
limit where $\gamma \to \infty$ because s(u) has a corner at u = 0. Right: The parabolic envelope 
of s(u) is given by the set of all supporting parabolas with given curvature $\gamma$. 



We see from this result that the entropy of the block-spin model is obtained only in
the limit $\gamma \to \infty$ because the entropy of this model has a cusp or "corner" at $u = 0$, which
can only be "supported" by a degenerate parabola of infinite curvature, as shown in the
center plot of Fig. 5. The same result also explains why the Gaussian ensemble is able to
recover nonconcave points of $s(u)$: By modifying the Legendre transform with an added
quadratic term, we are able to "reach" with a supporting parabola points of $s(u)$ that
cannot be "reached" with a supporting line; see Fig. 5. This implies, naturally, that the
Gaussian ensemble with $\gamma > 0$ can only recover more points of $s(u)$ as compared with the
canonical ensemble, at least if $s(u)$ has a nonconcave region (and is nondegenerate). (Of
course, if $s(u)$ is concave, then the Gaussian ensemble with $\gamma > 0$ necessarily recovers the
whole of $s(u)$ just as the canonical ensemble does.) Conversely, the Gaussian ensemble
with $\gamma < 0$ must recover fewer points of $s(u)$ than the canonical ensemble because
points supported by a supporting line may not be supported by a parabola with inverted
curvature. This explains our previous observation that the Gaussian ensemble with
$\gamma < 0$ is not useful for obtaining nonconcave entropies.



5.2. Betrag ensemble 

We now consider a different ensemble defined by the choice $g(u) = \gamma |u|$ with $\gamma \in \mathbb{R}$,
which will be referred to as the Betrag ensemble.* This ensemble was mentioned in [44]
and is somewhat related to piecewise linear Legendre transforms [51], but was never
applied before to any equilibrium models.

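Example 5.2. The partition function

$$Z_{N,\gamma}(\beta) = \int_{\Lambda_N} e^{-\beta H_N(\omega) - \gamma |H_N(\omega)|} \, d\omega \tag{38}$$

associated with the Betrag ensemble can easily be calculated for the block-spin model
because the energy of this model is positive when $y = 1$ and negative when $y = -1$. The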

* This ensemble could also be called the "absolute value ensemble", but German seems to provide a
better name.
Figure 6. (Color online) Betrag ensemble. Upper left: Betrag free energy $\varphi_\gamma(\beta)$ of the
block-spin model for different values of $\gamma$ (see right). Lower left: Derivative of $\varphi_\gamma(\beta)$.
Right: Deformed Legendre transform of the Betrag free energy, which recovers the
nonconcave entropy $s(u)$ (dashed line) as $\gamma \to \infty$. The entropy is recovered precisely
where $\varphi_\gamma$ is differentiable.



term $|H_N|$ is easily separable, as a result, and we obtain

$$Z_{N,\gamma}(\beta) = Z_N^{(1)}(\beta-\gamma) + Z_N^{(2)}(\beta+\gamma), \tag{39}$$

where $Z_N^{(1)}(\beta)$ and $Z_N^{(2)}(\beta)$ are the two canonical partition functions defined in Eq. (24).
Given the free energies $\varphi_1(\beta)$ and $\varphi_2(\beta)$ shown in Eq. (25), we therefore obtain

$$\varphi_\gamma(\beta) = \inf\{\varphi_1(\beta-\gamma), \varphi_2(\beta+\gamma)\} =
\begin{cases}
\varphi_1(\beta-\gamma) & \beta > 0 \\
\varphi_2(\beta+\gamma) & \beta \leq 0
\end{cases} \tag{40}$$

for the free energy of the Betrag ensemble.

This free energy function is shown in Fig. 6. As for the Gaussian free energy, we
see that $\varphi_\gamma(\beta)$ has a nondifferentiable point at $\beta = 0$ for all the values of $\gamma$ considered,
but that the image of the derivative of $\varphi_\gamma(\beta)$, which we denote by $u_{\beta,\gamma}$ in the lower-left
plot of Fig. 6, fills more and more points of the interval $[-1,1]$ as $\gamma$ is increased. These
properties of $\varphi_\gamma(\beta)$ were also observed for the Gaussian free energy and imply
that the modified Legendre transform of the Betrag ensemble, given by†

$$s(u_{\beta,\gamma}) = \beta u_{\beta,\gamma} + \gamma |u_{\beta,\gamma}| - \varphi_\gamma(\beta), \tag{41}$$

recovers more and more points of $s(u)$ as $\gamma$ is increased. This is illustrated in the
right-hand side plot of Fig. 6.

† See the general result shown in Eq. (30).
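The modified Legendre transform of the Betrag ensemble can be evaluated directly once $\varphi_1$ and $\varphi_2$ are known. The following sketch assumes the explicit forms $\varphi_1(\beta) = -\ln(1+e^{\beta})$ and $\varphi_2(\beta) = -\ln(1+e^{-\beta})$ for the free energies of Eq. (25), and $s(u) = -|u|\ln|u| - (1-|u|)\ln(1-|u|)$ for the entropy; neither formula is restated in this section, so both should be read as assumptions. The function names are illustrative.

```python
import numpy as np

def phi1(beta):
    """Assumed free energy of the negative-energy block (y = -1)."""
    return -np.logaddexp(0.0, beta)

def phi2(beta):
    """Assumed free energy of the positive-energy block (y = +1)."""
    return -np.logaddexp(0.0, -beta)

def phi_betrag(beta, gamma):
    """Betrag free energy: infimum of the two shifted free energies."""
    return np.minimum(phi1(beta - gamma), phi2(beta + gamma))

def u_betrag(beta, gamma):
    """Derivative of phi_betrag for beta != 0, computed branch by branch."""
    neg_branch = -1.0 / (1.0 + np.exp(-(beta - gamma)))  # phi1'(beta - gamma)
    pos_branch = 1.0 / (1.0 + np.exp(beta + gamma))      # phi2'(beta + gamma)
    return np.where(beta > 0, neg_branch, pos_branch)

def xlogx(p):
    """Return p*log(p) with the convention 0*log(0) = 0."""
    p = np.asarray(p, dtype=float)
    return p * np.log(np.where(p > 0, p, 1.0))

def entropy(u):
    """Assumed block-spin entropy on [-1, 1]."""
    p = np.clip(np.abs(u), 0.0, 1.0)
    return -(xlogx(p) + xlogx(1.0 - p))

def betrag_entropy(gamma, beta_max=20.0, n=400):
    """Points (u, s) recovered by the modified Legendre transform."""
    # An even number of grid points keeps beta = 0 (the kink) off the grid.
    beta = np.linspace(-beta_max, beta_max, n)
    u = u_betrag(beta, gamma)
    s = beta * u + gamma * np.abs(u) - phi_betrag(beta, gamma)
    return u, s

if __name__ == "__main__":
    for gamma in (0.0, 5.0, 15.0):
        u, s = betrag_entropy(gamma)
        print(gamma, np.abs(u).min(), np.max(np.abs(s - entropy(u))))
```

With these assumed forms, every recovered point lies on $s(u)$, and the recovered energies stay outside the interval $(-u_\gamma, u_\gamma)$ with $u_\gamma = 1/(1+e^{\gamma})$.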
Figure 7. (Color online) Comparison of $u_\gamma$ for the Gaussian ensemble (blue line) and
Betrag ensemble (purple line).



As for the Gaussian ensemble, one can show that the Betrag ensemble recovers
the full entropy only in the limit $\gamma \to \infty$. The comparison of Figs. 4 and 6 shows,
however, that the Betrag ensemble is more efficient at obtaining the full entropy than
the Gaussian ensemble. Indeed, both ensembles recover $s(u)$ over an interval of the form
$J_\gamma = [-1, -u_\gamma] \cup [u_\gamma, 1]$, but $u_\gamma$ converges to 0 as $\gamma \to \infty$ faster for the Betrag ensemble
than for the Gaussian ensemble; see Fig. 7. This difference in convergence is related
to the way the two ensembles achieve equivalence [41]: For the Gaussian ensemble, the
limit $\gamma \to \infty$ is required to recover the whole of $s(u)$ because, as already noted, $s(u)$
has a cusp at $u = 0$, whereas for the Betrag ensemble, the limit is needed because $s'(u)$
diverges around its cusp.††
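The two convergence rates can be estimated explicitly. This is a sketch under assumptions not restated in this section: that the entropy is $s(u) = -|u|\ln|u| - (1-|u|)\ln(1-|u|)$, and that the free energies of Eq. (25) are $\varphi_1(\beta) = -\ln(1+e^{\beta})$ and $\varphi_2(\beta) = -\ln(1+e^{-\beta})$. It is not a result quoted from the references.

```latex
% Betrag ensemble: u_gamma is the one-sided derivative of Eq. (40) at beta = 0
u_\gamma^{\mathrm{Betrag}}
  = \lim_{\beta \to 0^-} \varphi_2'(\beta+\gamma)
  = \frac{1}{1+e^{\gamma}}
  \sim e^{-\gamma}
  \qquad (\gamma \to \infty)

% Gaussian ensemble: by symmetry, the supporting line of s(u) - \gamma u^2/2
% that bridges the two humps is horizontal, so it touches where
s'(u_\gamma) = \gamma u_\gamma
  \iff
  \ln\frac{1-u_\gamma}{u_\gamma} = \gamma\, u_\gamma ,
  \qquad
  u_\gamma^{\mathrm{Gauss}} \sim \frac{\ln \gamma}{\gamma}
  \qquad (\gamma \to \infty)
```

Under these assumptions the Betrag value decays exponentially in $\gamma$, whereas the Gaussian value decays only as $(\ln\gamma)/\gamma$ to leading order, which is consistent with the faster convergence described above.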

The possibility of applying the Betrag ensemble to other models rests on being able to 
separate the partition function of this ensemble into two sums: one involving microstates 
having a positive energy, and one involving the complementary set of microstates having 
a negative energy. Such a separation is easily achieved for the block-spin model because 
of the structure of its Hamiltonian, but one cannot be so optimistic, of course, as to 
assume that this sort of partitioning trick can be achieved for more realistic models; it 
all depends on the form of the Hamiltonian that one considers. 

In addition to this consideration, it should be clear that, if the nonconcave region of
$s(u)$ is located in a region of positive energy, then the Betrag ensemble will not be able
to recover or "express" any nonconcave points of $s(u)$. In this case, one must replace
the function $g(u) = \gamma |u|$ by $g(u) = \gamma |u - u_0|$, where $u_0$ is some fixed value of the mean
energy located inside the nonconcave region of $s(u)$. In practice, this means that in order
to use the Betrag ensemble in any useful way, one must have some prior information
about where $s(u)$ is nonconcave in order to choose the right $u_0$.

†† This implies, in particular, that if $s(u)$ has a cusp with finite left- and right-derivatives, then the
Betrag ensemble achieves equivalence for a finite value $\gamma > 0$, whereas the Gaussian ensemble still
requires the limit $\gamma \to \infty$.
6. Restricted canonical ensemble

The concept of restricted canonical ensemble or restricted partition function was developed
by Penrose and Lebowitz [52] not for calculating nonconcave entropies, but as a way
to study metastable states in the canonical ensemble. However, from the discussion of
these two topics found in Sec. 4, one should expect that restricted partition functions
may also be useful for obtaining nonconcave entropies.

The idea behind restricted partition functions is, as the name suggests, to restrict
the sum over all microstates $\omega \in \Lambda_N$ defining $Z_N(\beta)$ to a subset of $\Lambda_N$, which will be
denoted by $R_N$. Thus instead of calculating $Z_N(\beta)$, one attempts to calculate

$$Z_N^R(\beta) = \int_{R_N} e^{-\beta H_N(\omega)} \, d\omega. \tag{42}$$

The choice of $R_N$ is determined by the fact that, when $s(u)$ is concave, the sum over
$\Lambda_N$ in the standard partition function $Z_N(\beta)$ is dominated in the thermodynamic limit
($N \to \infty$) by a subset of microstates of $\Lambda_N$ corresponding to the equilibrium states of
the canonical ensemble having a fixed energy. However, when $s(u)$ is nonconcave, there
is a further important - yet subdominant - contribution to the sum of $Z_N(\beta)$ coming from
metastable or unstable states of the canonical ensemble. If one chooses $R_N = \Lambda_N$, then
only the dominant equilibrium states will contribute in the partition function. But if one
selects $R_N$ so as to exclude the dominant states, then $Z_N^R(\beta)$ will be dominated by the
metastable or unstable states. In this case one should be able to recover the nonconcave
points of $s(u)$, or at least some of them, by taking the Legendre transform of the free
energy function $\varphi^R(\beta)$ associated with $Z_N^R(\beta)$. The next example shows how this works
in practice.

Example 6.1. We have seen with the Betrag ensemble that the microstate space $\Lambda_N$ of
the block-spin model can be partitioned, with respect to $H_N$, into microstates of positive
and negative energy. Let us use this partition to define a restricted partition function
$Z_N^+(\beta)$ by summing only over the microstates having a positive energy:

$$Z_N^+(\beta) = \sum_{y=1,\, \sigma_1, \ldots, \sigma_N} e^{-\beta H_N}. \tag{43}$$

Going back to Example 4.2, it is easy to see that $Z_N^+(\beta)$ is nothing but the "metastable"
partition function $Z_N^{(2)}(\beta)$ defined in Eq. (24). Therefore,

$$\varphi_+^*(u) = \varphi_2^*(u) = s_\sigma(2u-1) \tag{44}$$

for $u \in [0,1]$. A similar result can be derived for $u \in [-1,0]$ by calculating a restricted
partition function $Z_N^-(\beta)$ for the microstates having a negative energy. In this case,
$Z_N^-(\beta) = Z_N^{(1)}(\beta)$ and so

$$\varphi_-^*(u) = \varphi_1^*(u) = s_\sigma(2u+1) \tag{45}$$

for $u \in [-1,0]$. Hence, although neither of the restricted partition functions recovers the
whole of $s(u)$, their combination does.
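For small $N$, the restricted sums of this example can be checked by brute-force enumeration. The sketch below assumes that the block-spin Hamiltonian is $H_N = y \sum_{i=1}^{N} (1+\sigma_i)/2$ with $y, \sigma_i \in \{-1,+1\}$, a definition that is not restated in this section; under that assumption the positive- and negative-energy sums factorize into $(1+e^{-\beta})^N$ and $(1+e^{\beta})^N$. The function names are illustrative.

```python
import itertools
import math

def energy(y, sigma):
    """Assumed block-spin energy H_N = y * sum_i (1 + sigma_i) / 2."""
    return y * sum((1 + s) // 2 for s in sigma)

def restricted_Z(beta, n_spins, y_values):
    """Sum exp(-beta * H_N) over microstates whose block spin y is in y_values."""
    return sum(
        math.exp(-beta * energy(y, sigma))
        for y in y_values
        for sigma in itertools.product((-1, 1), repeat=n_spins)
    )

if __name__ == "__main__":
    beta, n_spins = 0.7, 10
    z_plus = restricted_Z(beta, n_spins, (1,))       # y = +1: energies >= 0
    z_minus = restricted_Z(beta, n_spins, (-1,))     # y = -1: energies <= 0
    z_full = restricted_Z(beta, n_spins, (-1, 1))    # unrestricted sum
    print(z_plus, (1.0 + math.exp(-beta)) ** n_spins)
    print(z_minus, (1.0 + math.exp(beta)) ** n_spins)
    print(z_full, z_plus + z_minus)
```

The enumeration scales as $2^{N+1}$ and is meant only as a consistency check of the partitioning argument, not as a practical method.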
The difficulty of working with restricted partition functions is similar to that of
working with the Betrag ensemble: in both cases, one must be able to calculate a
partition function over some restricted set of microstates. Whether this can be done
in practice depends on the Hamiltonian $H_N$ considered and, more precisely, on the
possibility to use symmetries of this Hamiltonian to partition $\Lambda_N$ in easily-definable
sets of microstates. The block-spin model has such a symmetry, as we have seen, which
allows for a straightforward calculation of $Z_N^+(\beta)$ and $Z_N^-(\beta)$, as well as $Z_{N,\gamma}(\beta)$, the
Betrag partition function. In fact, for this model, the functions $Z_N^+(\beta)$ and $Z_N^-(\beta)$ merely
re-create $Z_{N,\gamma}(\beta)$ but in two separate partition functions instead of one.

Of course, if $H_N$ admits an energy representation function and macrostate entropy
function, then restricted partition functions can be calculated for many conceivable
restrictions of $\Lambda_N$ simply by restricting the integrals over the macrostate that result
from the macrostate representation. In this context, the restriction method can be seen
as a way to locate the critical points of the function $G_\beta(m)$, considered in Sec. 2, by
restricting the range of values allowed for $M_N$.

Finally, note that in the extreme case where the sum-over-states of the standard
partition function $Z_N(\beta)$ is restricted only to microstates of constant energy $u$, the
resulting restricted free energy function is necessarily equal to the entropy, up to some
constant. This follows, of course, because the microcanonical ensemble is a special
restricted ensemble that considers only microstates with a constant energy.
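This last remark can be spelled out in one line. The sketch assumes the conventions $\varphi = -\lim_{N\to\infty} N^{-1} \ln Z_N$ and $s(u) = \lim_{N\to\infty} N^{-1} \ln \Omega_N(u)$, and treats the constant-energy restriction loosely, ignoring the width of the energy shell.

```latex
Z_N^{R}(\beta)
  = \int_{\{\omega \,:\, H_N(\omega) = N u\}} e^{-\beta H_N(\omega)} \, d\omega
  = e^{-\beta N u} \, \Omega_N(u)
\quad\Longrightarrow\quad
\varphi^{R}(\beta)
  = -\lim_{N \to \infty} \frac{1}{N} \ln Z_N^{R}(\beta)
  = \beta u - s(u)
```

With these conventions the restricted free energy is $-s(u)$ shifted by the term $\beta u$, which is fixed once $u$ and $\beta$ are fixed.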

7. Inverse Laplace transform 

The last method that we discuss has been introduced recently in [53]. Its basis is the
inverse Laplace transform that expresses the density of states $\Omega_N(u)$ in terms of the
partition function $Z_N(\beta)$:

$$\Omega_N(u) = \frac{1}{2\pi i} \int_{r-i\infty}^{r+i\infty} Z_N(\beta)\, e^{\beta N u} \, d\beta. \tag{46}$$

This integral is a complex integral along the path or contour $\mathrm{Re}(\beta) = r$, often referred to
as the Bromwich contour. The value of $r$ used to position this contour must be chosen
in the region of convergence of $Z_N(\beta)$, but is otherwise arbitrary.

Since the inverse Laplace transform expresses the density of states exactly in terms
of the partition function, it can be used, obviously, to obtain $s(u)$ from $Z_N(\beta)$ even if the
former is nonconcave. In fact, it is known that, since the entropy is a thermodynamic-limit
function, one needs to know in general only the asymptotic form of $Z_N(\beta)$ as $N \to \infty$ to
obtain $s(u)$ via the inverse Laplace transform. One has to be careful, however, to retain
the dominant and subdominant terms of $Z_N(\beta)$ when performing any approximations of
the Bromwich integral. If one retains only the dominant term, then only the concave
envelope of $s(u)$ is recovered, in accordance with our discussion of metastable branches
of $\varphi(\beta)$ (see Secs. 4 and 6). This point is illustrated next.

Example 7.1. Given $\varphi(\beta)$, we can approximate the partition function as $Z_N(\beta) \approx e^{-N\varphi(\beta)}$,
plug this approximation into the integral of the inverse Laplace transform of
Eq. (46), and then naively approximate the resulting integral by its saddlepoint (see,
e.g., Appendix C.1 of [26]) to obtain

$$\Omega_N(u) \approx e^{N \inf_\beta \{\beta u - \varphi(\beta)\}} = e^{N \varphi^*(u)} \tag{47}$$

and so $s(u) = \varphi^*(u)$. This result is correct, as we know, if $s(u)$ is concave, but not if
$s(u)$ is nonconcave; see Sec. 2 and especially Eq. (5).

A better approximation for $Z_N(\beta)$ is suggested by Eqs. (24) and (25):

$$Z_N(\beta) \approx e^{-N\varphi_1(\beta)} + e^{-N\varphi_2(\beta)}. \tag{48}$$

Using this expression in Eq. (46), we obtain

$$\Omega_N(u) \approx e^{N\varphi_1^*(u)} + e^{N\varphi_2^*(u)} \tag{49}$$

instead of Eq. (47), so that

$$s(u) = \sup\{\varphi_1^*(u), \varphi_2^*(u)\}. \tag{50}$$

We know from Eq. (27) that this last formula recovers the correct entropy. Therefore, in
the case of the block-spin model, the approximation shown in Eq. (49) is sufficient to
obtain $s(u)$.
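The difference between the naive saddlepoint result and the two-term formula can be seen numerically. The sketch below evaluates the Legendre-Fenchel transforms on a grid, assuming (these formulas are not restated in this section) $\varphi_1(\beta) = -\ln(1+e^{\beta})$, $\varphi_2(\beta) = -\ln(1+e^{-\beta})$, $\varphi = \min\{\varphi_1, \varphi_2\}$, and $s(u) = -|u|\ln|u| - (1-|u|)\ln(1-|u|)$. The grid and the function names are illustrative choices.

```python
import numpy as np

BETA = np.linspace(-40.0, 40.0, 8001)  # finite grid standing in for all real beta

def phi1(beta):
    """Assumed free energy of the negative-energy block."""
    return -np.logaddexp(0.0, beta)

def phi2(beta):
    """Assumed free energy of the positive-energy block."""
    return -np.logaddexp(0.0, -beta)

def legendre_fenchel(phi_values, u):
    """Grid estimate of inf over beta of { beta*u - phi(beta) } for each u."""
    return np.min(BETA[None, :] * u[:, None] - phi_values[None, :], axis=1)

def xlogx(p):
    """Return p*log(p) with the convention 0*log(0) = 0."""
    p = np.asarray(p, dtype=float)
    return p * np.log(np.where(p > 0, p, 1.0))

def entropy(u):
    """Assumed block-spin entropy on [-1, 1]."""
    p = np.clip(np.abs(u), 0.0, 1.0)
    return -(xlogx(p) + xlogx(1.0 - p))

def compare(u):
    """Return the single-term (naive) and two-term estimates of s(u)."""
    phi = np.minimum(phi1(BETA), phi2(BETA))          # dominant free energy
    s_naive = legendre_fenchel(phi, u)                # transform of phi alone
    s_two = np.maximum(legendre_fenchel(phi1(BETA), u),
                       legendre_fenchel(phi2(BETA), u))  # sup of both transforms
    return s_naive, s_two

if __name__ == "__main__":
    u = np.linspace(-0.95, 0.95, 191)
    s_naive, s_two = compare(u)
    print("max |s_two - s|   :", np.max(np.abs(s_two - entropy(u))))
    print("s_naive(0), s(0)  :", s_naive[95], entropy(u)[95])
```

Under these assumptions the single-term estimate is flat at $\ln 2$ for $|u| < 1/2$ (the concave envelope), while the two-term estimate follows the nonconcave entropy down to the cusp at $u = 0$.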

The previous example can be generalized to any model whose partition function
can be put in the form

$$Z_N(\beta) = \sum_j c_{N,j}(\beta)\, e^{-N\varphi_j(\beta)}, \tag{51}$$

where $\varphi_j(\beta)$ are concave and smooth functions of $\beta$ that do not depend on $N$, and
$c_{N,j}(\beta)$ are functions of $\beta$ that are sub-exponential in $N$. In [53] it is shown that if these
assumptions are satisfied and the coefficients $c_{N,j}(\beta)$ have no poles in the complex
$\beta$-plane, then $s(u)$ is given by a direct generalization of Eq. (50) involving the "metastable"
free energies $\varphi_j(\beta)$. However, if any of these coefficients have poles in $\beta$, then $s(u)$
is given by a more complicated formula involving the $\varphi_j(\beta)$'s as well as the poles of
$c_{N,j}(\beta)$. The surprising effect of these poles is that they determine the presence of linear
branches in the graph of $s(u)$, which arise in short-range systems having first-order phase
transitions. For more details on these results, the reader is referred again to [53].



8. Comments and open problems 

The examples given in the previous sections are very simple, but provide nevertheless 
a useful guide as to how the different methods that we have covered in this paper 
can be applied in practice to obtain the nonconcave entropy of more realistic models. 
They provide, in particular, a good illustration of the properties that one should look 
for when selecting the right method to use. On the one hand, if the Hamiltonian 
considered has any symmetries that can be used to partition the microstate space 
in easily-definable regions having different energies, then it may be possible to obtain 
s(u) using the Betrag ensemble or the restricted canonical ensemble. On the other hand, 
if the Hamiltonian admits a macrostate representation, discussed in Sec. 3, then the
microcanonical contraction formula or its canonical version involving $G_\beta(m)$ will provide
a more direct way to obtain $s(u)$, although all the other methods can also be used in
this case, since they all admit a macrostate representation.

If none of these cases apply, then the application of any of the methods discussed
here is likely to lead to difficult or even intractable calculations. This should hardly
come as a surprise: after all, the calculation of the standard partition function $Z_N(\beta)$
is known to be a difficult problem in general, and so must be the calculation of any
generalizations of $Z_N(\beta)$. In fact, if one cannot analytically calculate $Z_N(\beta)$ for a given
model, then it is very unlikely that one will be able to analytically calculate any of the
generalized partition functions described before. In this case, one may have to resort to
approximation methods or numerical methods, such as Monte Carlo methods based on
generalized ensembles (see, e.g., [45-49, 54]).

To conclude this paper, we present next a short list of open problems related to the 
generalized canonical ensembles discussed in Sec. 5. The first problem is relevant for
practical calculations in the Gaussian ensemble, whereas the second is concerned with 
the numerical implementation of generalized ensembles. The last two problems point to 
some interesting connections with convex analysis. 



Gaussian integral for the Gaussian ensemble: Study the integral of Eq. (32), which
expresses the Gaussian partition function in terms of a complex transform of the
standard partition function, in order to see if this integral can be approximated in
any useful way. We have already commented on the fact that this integral cannot be
solved for the block-spin model, and is unlikely to be computable in general. The
reason for this is that the integrand $e^{-\gamma N t^2/2} Z_N(\beta + i\gamma t)$ is highly oscillatory when
$\gamma > 0$, which prevents one from performing any form of saddle-point approximation.
Other types of approximation may be possible, however.

Generalized canonical ensembles and multicanonical simulation methods: There is a
strong suggestion that the generalized canonical ensembles discussed in Sec. 5 are
related to a set of numerical methods known collectively as multicanonical methods
or umbrella sampling methods (see, e.g., [55-61]). The exact connection, however,
has yet to be made explicit.

Physical interpretation of generalized ensembles: We mentioned in Sec. 5 that the
Gaussian ensemble can be interpreted physically as a statistical-mechanical ensemble 
describing a sample system coupled to a finite-size heat bath (as opposed to the 
canonical ensemble which describes a sample system coupled to an infinite-size heat 
bath). Is there a similar physical interpretation for the Betrag ensemble? Are there 
other generalized ensembles for which a physical interpretation or "realization" can 
be found or constructed? 

Supporting functions for generalized canonical ensembles: We have seen that the 
concept of supporting lines, which provides a geometrical interpretation of the 
Legendre transform, is generalized in the Gaussian ensemble to the concept of 
supporting parabolas. It is not known whether other generalized ensembles admit
a similar notion of supporting function. One may ask, for example, whether the
supporting function of the Betrag ensemble is the "absolute value" function. More
generally, are there other types of supporting functions for other choices of $g(u)$?

Moreau transforms: The quadratic Legendre transform of the Gaussian ensemble,
defined in Eq. (36), appears to be related to a functional transform known in convex
analysis as the Moreau transform [62]. Are there any known properties of the latter
transform that could be used to simplify calculations in the Gaussian ensemble?
Moreover, are there generalizations of the Legendre or the Moreau transform that
could be used to define other types of generalized canonical ensembles?